In this second exercise notebook we will play with Convolutional Neural Networks (CNNs).
As you should have seen, a CNN is a feed-forward neural network typically composed of Convolutional, MaxPooling and Dense layers.
If the task implemented by the CNN is a classification task, the last Dense layer should use the Softmax activation, and the loss should be the categorical crossentropy.
Reference: https://github.com/fchollet/keras/blob/master/examples/cifar10_cnn.py
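To make the softmax activation and the categorical crossentropy loss mentioned above concrete, here is a minimal pure-Python sketch. The scores and the one-hot target are made-up values for illustration only; Keras computes the same quantities on tensors.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def categorical_crossentropy(one_hot_target, probs):
    """Negative log-probability assigned to the true class."""
    return -sum(t * math.log(p) for t, p in zip(one_hot_target, probs))

scores = [2.0, 1.0, 0.1]   # hypothetical outputs of the last Dense layer
target = [1, 0, 0]         # hypothetical one-hot label: class 0 is the correct one
probs = softmax(scores)
loss = categorical_crossentropy(target, probs)
print(probs, loss)
```

The loss shrinks towards zero as the probability assigned to the correct class approaches 1.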
A simple CNN, with one input branch and one output branch, can be defined using a Sequential model by stacking all its layers together.
In this exercise we want to build a (quite shallow) network which contains two [Convolution, Convolution, MaxPooling] stages, and two Dense layers.
To test a different optimizer, we will use AdaDelta, which is a bit more complex than vanilla SGD with momentum.
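To give an idea of what makes AdaDelta more involved than plain SGD, the sketch below applies its update rule to a single scalar parameter. It is an illustration written for this notebook, not the Keras implementation: the function name `adadelta_minimize`, the toy objective $f(x) = x^2$ and the values of `rho` and `eps` are assumptions chosen for the example.

```python
import math

def adadelta_minimize(grad_fn, x, rho=0.95, eps=1e-6, steps=100):
    """Run a few AdaDelta updates on a scalar parameter x."""
    avg_sq_grad = 0.0   # running average of squared gradients
    avg_sq_step = 0.0   # running average of squared updates
    for _ in range(steps):
        g = grad_fn(x)
        avg_sq_grad = rho * avg_sq_grad + (1 - rho) * g * g
        # The step size is the ratio of the two running RMS values: no global learning rate
        step = -math.sqrt(avg_sq_step + eps) / math.sqrt(avg_sq_grad + eps) * g
        avg_sq_step = rho * avg_sq_step + (1 - rho) * step * step
        x += step
    return x

x_start = 1.0
x_end = adadelta_minimize(lambda x: 2.0 * x, x_start)  # gradient of f(x) = x**2
print(x_start, x_end)
```

Unlike SGD, the effective step size adapts to the history of gradients and updates, so no learning rate has to be tuned by hand in this formulation.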
In [2]:
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Flatten, Activation
from keras.layers.convolutional import Convolution2D, MaxPooling2D
from keras.optimizers import Adadelta
input_shape = (3, 32, 32)
nb_classes = 10
## [conv@32x3x3+relu]x2 --> MaxPool@2x2 --> DropOut@0.25 -->
## [conv@64x3x3+relu]x2 --> MaxPool@2x2 --> DropOut@0.25 -->
## Flatten--> FC@512+relu --> DropOut@0.5 --> FC@nb_classes+SoftMax
## NOTE: in each pair of Conv layers, the first must use `border_mode="same"` and the second `border_mode="valid"`
## your code here
In [ ]:
# %load solutions/sol_223.py
An important feature of Keras layers is that each of them has an `input_shape` attribute, which you can use to visualize the shape of the input tensor, and an `output_shape` attribute, for inspecting the shape of the output tensor.
As we can see, the input shape of the first convolutional layer corresponds to the `input_shape` attribute (which must be specified by the user). In this case, it is a 32x32 image with three color channels.
Since this convolutional layer has its `border_mode` set to `same`, its output width and height remain the same, and the number of output channels is equal to the number of filters learned by the layer, 32.
Convolutional layers that use the default `border_mode` (`valid`), instead, reduce width and height by $(k-1)$, where $k$ is the size of the kernel.
MaxPooling layers, in turn, reduce the width and height of the input tensor, but keep the same number of channels. Activation layers, of course, don't change the shape.
In [4]:
for i, layer in enumerate(model.layers):
    print("Layer", i, "\t", layer.input_shape, "\t", layer.output_shape)
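The shape rules described above can also be checked by hand. The sketch below walks through the convolutional part of the architecture from the exercise, assuming 3x3 kernels with stride 1 and non-overlapping 2x2 pooling; the helper names are hypothetical and not part of Keras.

```python
def conv_output_size(size, kernel, border_mode):
    """Spatial size after a stride-1 convolution."""
    if border_mode == "same":
        return size
    if border_mode == "valid":
        return size - (kernel - 1)
    raise ValueError("unknown border_mode: %r" % border_mode)

def pool_output_size(size, pool):
    """Spatial size after non-overlapping max pooling (stride equal to the pool size)."""
    return size // pool

channels, side = 3, 32                # input_shape = (3, 32, 32)
shapes = [(channels, side, side)]
for n_filters in (32, 64):            # the two [Conv, Conv, MaxPool] stages
    for mode in ("same", "valid"):
        side = conv_output_size(side, 3, mode)
        channels = n_filters
        shapes.append((channels, side, side))
    side = pool_output_size(side, 2)
    shapes.append((channels, side, side))

flattened = channels * side * side    # size of the vector fed to the first Dense layer
for shape in shapes:
    print(shape)
print("Flatten:", flattened)
```

Dropout and Activation layers are left out of the walk because they do not change the shape.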
In the same way, we can visualize the shape of the weights learned by each layer. In particular, Keras lets you inspect weights by using the `get_weights` method of a layer object. For layers with learnable parameters, this returns a list with two elements, the first one being the weight tensor and the second one being the bias vector.
Of course, MaxPooling layers don't have any weight tensor, since they don't have learnable parameters. Convolutional layers, instead, learn a $(n_o, n_i, k, k)$ weight tensor, where $k$ is the size of the kernel, $n_i$ is the number of channels of the input tensor, and $n_o$ is the number of filters to be learned. For each of the $n_o$ filters, a bias is also learned. Dense layers learn a $(n_i, n_o)$ weight tensor, where $n_o$ is the output size and $n_i$ is the input size of the layer. Each of the $n_o$ neurons also has a bias.
In [5]:
for i, layer in enumerate(model.layers):
    if len(layer.get_weights()) > 0:
        print("Layer", i, "\t", layer.get_weights()[0].shape, "\t", layer.get_weights()[1].shape)
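The weight shapes given above translate directly into parameter counts. The following sketch counts the learnable parameters of the network described in the exercise, assuming 3x3 kernels and a flattened size of 64 x 6 x 6 = 2304 before the first Dense layer; the layer names in the list are labels chosen for this example only.

```python
def conv_params(n_in, n_out, k):
    """Weight tensor of shape (n_out, n_in, k, k) plus one bias per filter."""
    return n_out * n_in * k * k + n_out

def dense_params(n_in, n_out):
    """Weight tensor of shape (n_in, n_out) plus one bias per neuron."""
    return n_in * n_out + n_out

param_counts = [
    ("conv1", conv_params(3, 32, 3)),
    ("conv2", conv_params(32, 32, 3)),
    ("conv3", conv_params(32, 64, 3)),
    ("conv4", conv_params(64, 64, 3)),
    ("dense1", dense_params(64 * 6 * 6, 512)),
    ("dense2", dense_params(512, 10)),
]
total = sum(count for _, count in param_counts)
for name, count in param_counts:
    print(name, count)
print("total", total)
```

Most of the parameters sit in the first Dense layer, which is typical for small CNNs that flatten a feature map before a wide fully connected layer.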
We will train our network on the CIFAR10 dataset, which contains 50,000 32x32 color training images, each labeled with one of 10 categories, and 10,000 test images.
As this dataset is also included in the Keras datasets, we just ask the `keras.datasets` module for it.
Training and test images are normalized to lie in the $\left[0,1\right]$ interval.
In [6]:
from keras.datasets import cifar10
from keras.utils import np_utils
(X_train, y_train), (X_test, y_test) = cifar10.load_data()
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)
X_train = X_train.astype("float32")
X_test = X_test.astype("float32")
X_train /= 255
X_test /= 255
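The two preprocessing steps in the cell above can be illustrated on toy data. This is a plain-Python sketch of the idea, not the Keras `to_categorical` code; the labels and pixel values are made up.

```python
def to_one_hot(labels, n_classes):
    """Encode integer class labels as one-hot rows."""
    rows = []
    for label in labels:
        row = [0.0] * n_classes
        row[label] = 1.0
        rows.append(row)
    return rows

labels = [3, 0, 9]                 # hypothetical class indices
one_hot = to_one_hot(labels, 10)

pixels = [0, 128, 255]             # hypothetical 8-bit pixel intensities
scaled = [p / 255.0 for p in pixels]

print(one_hot[0])
print(scaled)
```

Each one-hot row has a single 1 at the index of its class, which is the target format expected by the categorical crossentropy loss.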
To reduce the risk of overfitting, we also apply some image transformations, such as shifts and flips (rotations are supported too, but are disabled in the configuration below). All of these can be easily implemented using the Keras ImageDataGenerator.
In [7]:
from keras.preprocessing.image import ImageDataGenerator
generated_images = ImageDataGenerator(
    featurewise_center=True,  # set input mean to 0 over the dataset
    samplewise_center=False,  # set each sample mean to 0
    featurewise_std_normalization=True,  # divide inputs by std of the dataset
    samplewise_std_normalization=False,  # divide each input by its std
    zca_whitening=False,  # apply ZCA whitening
    rotation_range=0,  # randomly rotate images in the range (degrees, 0 to 180)
    width_shift_range=0.2,  # randomly shift images horizontally (fraction of total width)
    height_shift_range=0.2,  # randomly shift images vertically (fraction of total height)
    horizontal_flip=True,  # randomly flip images horizontally
    vertical_flip=False)  # randomly flip images vertically
generated_images.fit(X_train)
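To see what a flip and a shift do to an image, here is a simplified sketch on a tiny 2D grid. It is only an illustration of the idea: the helper names are hypothetical, the shift goes in one direction only, and empty pixels are filled with a constant, whereas ImageDataGenerator works on full image tensors and has its own fill strategy.

```python
import random

def horizontal_flip(image):
    """Mirror each row of a 2D image given as a list of rows."""
    return [row[::-1] for row in image]

def shift_right(image, pixels, fill=0):
    """Shift a 2D image to the right, padding the left edge with `fill`."""
    return [[fill] * pixels + row[:len(row) - pixels] for row in image]

def random_augment(image, rng, max_shift=1):
    """Flip with probability 0.5, then shift by a random number of pixels."""
    if rng.random() < 0.5:
        image = horizontal_flip(image)
    return shift_right(image, rng.randint(0, max_shift))

image = [[1, 2, 3],
         [4, 5, 6]]
flipped = horizontal_flip(image)
shifted = shift_right(image, 1)
augmented = random_augment(image, random.Random(0))
print(flipped, shifted, augmented)
```

Because the transformation is drawn at random for every batch, the network rarely sees exactly the same image twice, which is what helps against overfitting.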
Now we can start training.
At each iteration, a batch of 500 images is requested from the `ImageDataGenerator` object, and then fed to the network.
In [10]:
X_train.shape
In [11]:
gen = generated_images.flow(X_train, Y_train, batch_size=500, shuffle=True)
X_batch, Y_batch = next(gen)
In [12]:
X_batch.shape
In [ ]:
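from keras.utils import generic_utils
n_epochs = 2
for e in range(n_epochs):
    print('Epoch', e)
    print('Training...')
    progbar = generic_utils.Progbar(X_train.shape[0])
    samples_seen = 0
    for X_batch, Y_batch in generated_images.flow(X_train, Y_train, batch_size=500, shuffle=True):
        loss = model.train_on_batch(X_batch, Y_batch)
        # loss[0] assumes the model was compiled with metrics, so that a list is returned
        progbar.add(X_batch.shape[0], values=[('train loss', loss[0])])
        samples_seen += X_batch.shape[0]
        # depending on the Keras version, flow() may loop forever: stop after one pass over the data
        if samples_seen >= X_train.shape[0]:
            break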
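The structure of this training loop can be sketched without Keras. The example below assumes a batch generator that loops over the data indefinitely, as some Keras versions do, and shows how counting the samples seen delimits one epoch; `endless_batches` is a hypothetical stand-in for the real generator and yields index ranges instead of image batches.

```python
def endless_batches(n_samples, batch_size):
    """Yield (start, stop) index ranges forever, one pass over the data after another."""
    while True:
        for start in range(0, n_samples, batch_size):
            yield start, min(start + batch_size, n_samples)

n_samples, batch_size = 50000, 500   # CIFAR10 training set size and the batch size used above
samples_seen = 0
n_batches = 0
for start, stop in endless_batches(n_samples, batch_size):
    samples_seen += stop - start     # a real loop would call train_on_batch here
    n_batches += 1
    if samples_seen >= n_samples:    # one full pass done: end the epoch
        break

print(n_batches, samples_seen)
```

With 50,000 training images and batches of 500, one epoch corresponds to 100 weight updates.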